Decoder Technology for Connectionist Large Vocabulary Speech Recognition

Authors

Steve Renals

Mike Hochberg

Abstract

The search problem in large vocabulary continuous speech recognition (LVCSR) is to locate the most probable string of words for a spoken utterance given the acoustic signal and a set of sentence models. Searching the space of possible utterances is difficult because of the large vocabulary size and the complexity imposed when long-span language models are used. This report describes an efficient search procedure and its software embodiment in a decoder, NOWAY, which has been incorporated in ABBOT, a hybrid connectionist/hidden Markov model (HMM) LVCSR system [15]. The search algorithm is based on stack decoding and uses both likelihood- and posterior-based pruning. The posterior-based phone deactivation pruning technique is well suited to hybrid connectionist/HMM systems because posterior phone probabilities are computed directly by the connectionist acoustic model. The single-pass decoder has been evaluated on the large vocabulary North American Business News task using a 20,000 word vocabulary and a trigram language model. The results indicate that phone deactivation pruning increased the search speed by an order of magnitude while incurring 2% or less relative search error. Using a Pentium-based PC system, evaluation quality decoding (less than 3% relative search error) was available with execution speeds 2–5 times slower than realtime, and realtime decoding was available at the cost of 4–12% relative search error.
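The idea behind phone deactivation pruning can be illustrated with a minimal sketch. It assumes that the acoustic model emits one posterior probability per phone for each frame, and that a phone is skipped by the search in any frame where its posterior falls below a fixed threshold. The function names, the phone labels, and the threshold value are hypothetical and chosen for illustration only; this is not the NOWAY implementation.

```python
from typing import Dict, List, Set


def active_phones(posteriors: Dict[str, float], threshold: float) -> Set[str]:
    """Return the phones kept active for one frame.

    `posteriors` maps a phone label to its posterior probability for the
    frame (assumed to come from the connectionist acoustic model).
    Phones with a posterior below `threshold` are deactivated.
    """
    return {phone for phone, p in posteriors.items() if p >= threshold}


def prune_frames(frames: List[Dict[str, float]], threshold: float) -> List[Set[str]]:
    """Apply the per-frame deactivation rule to a whole utterance."""
    return [active_phones(frame, threshold) for frame in frames]


if __name__ == "__main__":
    # Hypothetical posteriors for two frames and three phones.
    frames = [
        {"ae": 0.90, "t": 0.09, "s": 0.01},
        {"ae": 0.30, "t": 0.69, "s": 0.01},
    ]
    # The threshold 0.05 is an illustrative value, not one from the report.
    for index, kept in enumerate(prune_frames(frames, threshold=0.05)):
        print(index, sorted(kept))
```

In a decoder built along these lines, the search would only extend hypotheses through phones in the active set for the current frame, which is where the computational saving would come from.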


Related resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

The ISIP Public Domain Decoder for Large Vocabulary Conversational Speech Recognition

Recent improvements to the ABBOT large vocabulary CSR system

ABBOT is the hybrid connectionist-hidden Markov model (HMM) large-vocabulary continuous speech recognition (CSR) system developed at Cambridge University. This system uses a recurrent network to estimate the acoustic observation probabilities within an HMM framework. A major advantage of this approach is that good performance is achieved using context-independent acoustic models and requiring m...

Start-synchronous search for large vocabulary continuous speech recognition

In this paper, we present a novel, efficient search strategy for large vocabulary continuous speech recognition. The search algorithm, based on a stack decoder framework, utilizes phone-level posterior probability estimates (produced by a hybrid connectionist/HMM acoustic model) as a basis for phone deactivation pruning — a highly efficient method of reducing the required computation. The singl...

Towards large vocabulary ASR on embedded platforms

In this paper we present an overview of an automatic speech recognition system implementation in the context of embedded systems. Specific challenges presented by low resource platforms will be addressed for the basic components of an ASR decoder. Our main objective is to utilize and modify the technology developed for large vocabulary ASR to achieve efficient LVCSR on embedded systems as well.

Publication date: 1995
